Inverse, Multiple and Sequential Sample Censuses¹

Douglas G. Chapman
University  of  Washington 

The enumeration of populations by sampling methods has only recently
come into widespread use, though the idea dates back at least to 1783
when it was proposed by Laplace [9]. Approximately a century later,
Petersen [13] used the tag and sample method to enumerate a plaice
population. The basic procedure consists of tagging or marking a number of
individuals within the population and subsequently sampling, at random,
the whole population. Estimates of the population are based on the
random number of tag recoveries in the sample; the number of individuals
marked and the sample size are known parameters. A study of such
estimates and of tests for the population size has been made by the author [4].

While this simple model suffices for many purposes it is apparent
that more complex models will often be necessary or desirable. In many
cases the tagging and sampling is carried out in several stages. Such a
procedure is referred to, here, as a multiple sample census. Various
point and interval estimation formulae based on this procedure have been
given by Schnabel [17], Schumacher and Eschmeyer [18], DeLury [7], [7a],
Schaefer [16] and the author. This type of procedure lends itself to a
sequential procedure. Tests and estimation formulae based on such multiple
and sequential procedures are formulated in this paper.


An even simpler modification of the single step census is to "invert"
the sampling process, i.e. to fix the number of tagged individuals to be
recovered by sampling, rather than fixing the sample size. Some formulae
using this idea were given recently by Bailey [1].

¹Research partially supported by the Office of Naval Research.


INVERSE SAMPLE CENSUSES


The following notation will be used throughout:

$N$: the population size being studied

$t_i$: the number of marked individuals in the population at the time the $i$-th sample is taken

$n_i$: the number of individuals taken in the $i$-th sample

$s_i$: the number of tagged individuals recovered in the $i$-th sample

$s_{ij}$: the number of individuals recovered in the $i$-th sample that had been tagged at the $j$-th tagging.

For convenience we also define $n_0 = t_1$ and $s_0 = 0$. The subscript $i$
may be taken to run from 1 to $k$. Where $k$ is 1, i.e. the sample census is
conducted in one stage, the subscripts will be omitted for simplicity.

If the inverted sampling plan is used the $s_i$ are fixed parameters,
the $n_i$ are random variables. For the single sample case $n$ is a negative
binomial or negative hypergeometric random variable according to whether
sampling is with or without replacement.

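If sampling is with replacement, $\Pr(n)$, the probability of having to
sample $n$ individuals to obtain $s$ marked ones, is given by

(1)  $$\Pr(n) = \binom{n-1}{s-1}\left(\frac{t}{N}\right)^{s}\left(1-\frac{t}{N}\right)^{n-s}$$

and $E(n) = \dfrac{sN}{t}$.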

Hence $nt/s$ is an unbiased estimate of $N$, which, as the author has
pointed out [4], is not true when $n$ is fixed and $s$ the random variable.
The variance of $nt/s$ is $(N^2 - Nt)/s$ and an unbiased estimate of this
variance is

(2)  $$\hat\sigma^2 = \frac{t^2\, n\,(n-s)}{s^2 (s+1)}.$$

As noted in Appendix I

(3)  $$z = \frac{\dfrac{nt}{s} - N}{\sqrt{\dfrac{N^2 - Nt}{s}}}$$

is approximately distributed according to $N(0,1)$ for large $s$. This fact
may be used to set up confidence intervals and tests for $N$.
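As a hedged illustration of the with-replacement case, the sketch below evaluates the estimate $nt/s$, the unbiased variance estimate of formula (2) and a rough normal-theory interval. The function name `inverse_census_with_replacement` and the default multiplier `z = 1.96` are assumptions made here for illustration; substituting the estimated variance for the true variance in the interval is a simplification, not the paper's stated procedure.

```python
import math


def inverse_census_with_replacement(n, s, t, z=1.96):
    """Estimate N from an inverse sample census taken with replacement.

    n : total number of individuals drawn (the random variable)
    s : preassigned number of tagged recoveries
    t : number of tagged individuals in the population
    z : normal multiplier for the interval (1.96 is an assumed default)
    """
    if not (s >= 1 and n >= s and t >= 1):
        raise ValueError("need s >= 1, n >= s and t >= 1")
    estimate = n * t / s                              # unbiased estimate nt/s
    variance = t**2 * n * (n - s) / (s**2 * (s + 1))  # unbiased for (N^2 - Nt)/s
    half_width = z * math.sqrt(variance)
    return estimate, variance, (estimate - half_width, estimate + half_width)


# Hypothetical data: 100 tags at large, sampling stopped at the 20th recovery.
estimate, variance, interval = inverse_census_with_replacement(n=200, s=20, t=100)
```

With these made-up figures the point estimate is 1000; the interval is only as trustworthy as the normal approximation, which the text recommends for large $s$.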

For $s$ and $t/N$ both small, as will be more frequently the case, the
second limiting distribution given in Appendix I will be more useful. For
this range of the parameters $2nt/N$ has approximately a $\chi^2$ distribution with
$2s$ degrees of freedom. This is equivalent to the Poisson approximation to
the binomial and is related to the results of Sandelius [15] concerning
inverse sampling with random variables distributed according to a Poisson
distribution.

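If $\chi^2_{2s}(\epsilon)$ denotes the $\epsilon$-quantile of the $\chi^2$ distribution with $2s$
degrees of freedom, i.e.

$$\Pr\left(\chi^2_{2s} \le \chi^2_{2s}(\epsilon)\right) = \epsilon,$$

then $(1-\epsilon)$ confidence limits for $N$ are given by

(4)  $$\frac{2tn}{\chi^2_{2s}(1-\epsilon/2)} < N < \frac{2tn}{\chi^2_{2s}(\epsilon/2)}.$$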


The problem of improving on these confidence limits in a procedure
similar  to  that  treated  in  1 will  be  considered  elsewhere. 

If the sampling occurs without replacement

(5)  $$\Pr(n) = \frac{(n-1)!\; t!\; (N-t)!\; (N-n)!}{(s-1)!\; (t-s)!\; (n-s)!\; N!\; (N-n-t+s)!}$$

and

(6)  $$E(n) = \frac{s(N+1)}{t+1}$$

so that

(7)  $$\hat N = \frac{n(t+1)}{s} - 1$$

is an unbiased estimate of $N$. In contrast to the direct sample census
estimate, this unbiasedness does not depend on the parameters $s$ and $t$.
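The properties claimed for sampling without replacement can be checked by brute force on a small population. The following sketch, in exact rational arithmetic, sums over the distribution of formula (5) written in its equivalent binomial-coefficient form; the helper names `pmf_inverse` and `expectation` and the values of `N`, `t`, `s` are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from math import comb


def pmf_inverse(n, N, t, s):
    """Pr(n): probability that the s-th tag is recovered on draw n (no replacement)."""
    if n < s or n > N - t + s:
        return Fraction(0)
    return Fraction(comb(n - 1, s - 1) * comb(N - n, t - s), comb(N, t))


def expectation(f, N, t, s):
    """Exact expectation of f(n) under the distribution above."""
    return sum(f(n) * pmf_inverse(n, N, t, s) for n in range(s, N - t + s + 1))


N, t, s = 60, 12, 4  # hypothetical population, tags and preassigned recoveries


def estimate(n):
    return Fraction(n * (t + 1), s) - 1  # the estimate n(t+1)/s - 1


total = expectation(lambda n: 1, N, t, s)
mean_n = expectation(lambda n: n, N, t, s)
mean_estimate = expectation(estimate, N, t, s)
var_estimate = expectation(lambda n: estimate(n) ** 2, N, t, s) - mean_estimate**2
```

Here `mean_n` reproduces $s(N+1)/(t+1)$ and `mean_estimate` equals $N$ exactly, whatever admissible values of `s` and `t` are tried.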




Furthermore

(8)  $$\sigma_{\hat N}^2 = \frac{(N+1)(N-t)(t-s+1)}{s(t+2)} \approx \frac{N^2}{s}.$$

This approximate formula (which exaggerates the actual variance) may be
useful in the determination of the choice of $s$, as by an appropriate choice
of $s$, $\sigma_{\hat N}/N$ can be fixed at any desired level.

For testing purposes we note that $\hat N$ is approximately normal with mean
$N$ and variance given by (8) for large $s$. A model similar to that used by
David [6] can be constructed to prove that as $s$ tends to infinity $n$ is
asymptotically normally distributed.

On the average the inverse sampling procedure is better than the
direct sampling procedure. For if $n'$, $s'$ denote the fixed sample size
and the random number of tags recovered in the direct procedure, and if $s$
is chosen equal to


J&Lt  , 
N 


then 


E(n) 


At 

4 


2 N* 
while  cT^  < ~ 
N 8 


< n' 

M’(bi)<triet jusai1 

v y s'tl 


tiie  almost  unbiased  estimate  in  the  direct  ounpling  case, 

s'vl 

Hence a more efficient estimate is obtained with less average effort.

On the other hand if the experimenter knows absolutely nothing about
the possible population size, then by an improper choice of $t$ and $s$, $E(n)$
may be extremely large. Moreover

$$\sigma_n^2 = \frac{s(N+1)(N-t)(t-s+1)}{(t+1)^2(t+2)}$$

is very large; this may be regarded as an undesirable feature of the
procedure.



These difficulties may be partly overcome by a modification of the
inverse sampling plan as follows: the number of untagged individuals to be
recaptured is predetermined, rather than the number of tagged individuals.
In other words, $n-s$ is chosen in advance of sampling: $s$ and $n$ are now both
random variables, though completely dependent. For convenience write $n-s = \delta$.

(9)  $$\Pr(n) = \frac{(n-1)!\; t!\; (N-t)!\; (N-n)!}{(n-\delta)!\,(t-n+\delta)!\,(\delta-1)!\,(N-t-\delta)!\; N!}$$

is derived in the usual manner. However the obvious estimate again is no
longer strictly unbiased. For

(10)  $$E\left[\frac{n(t+1)}{n-\delta+1}\right] = (N+1)\left[1 - \frac{(N-\delta+1)!\,(N-t)!}{(N-t-\delta)!\,(N+1)!}\right].$$

This result is analogous to that obtained by the author in [4]; by
the same argument and formulae given there (pp. 145-146) it follows that
the second term in the parentheses is negligible provided $\delta t/N > \log N$.

For such values of $\delta$, $t$, $N$ the estimate

(11)  $$\hat N = \frac{n(t+1)}{s+1} - 1$$

has bias less than 1 in absolute value.
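The size of the bias in the modified plan can be examined directly. The sketch below computes the exact bias of $n(t+1)/(s+1) - 1$, with $s = n - \delta$, by summing over the distribution of $n$ when sampling stops at the $\delta$-th untagged individual. The names `pmf_modified` and `bias_of_estimate` and the numerical inputs are illustrative assumptions, and the code presumes $\delta \le N - t$.

```python
from fractions import Fraction
from math import comb


def pmf_modified(n, N, t, delta):
    """Pr(n): the delta-th untagged individual is obtained on draw n."""
    s = n - delta  # number of tagged individuals taken along the way
    if s < 0 or s > t:
        return Fraction(0)
    return Fraction(comb(n - 1, delta - 1) * comb(N - n, t - s), comb(N, t))


def bias_of_estimate(N, t, delta):
    """Exact bias of n(t+1)/(s+1) - 1 as an estimate of N (requires delta <= N - t)."""
    mean = sum(
        (Fraction(n * (t + 1), n - delta + 1) - 1) * pmf_modified(n, N, t, delta)
        for n in range(delta, delta + t + 1)
    )
    return mean - N


# Hypothetical design with delta * t / N = 12, well above log N (about 5.3).
small_bias = bias_of_estimate(N=200, t=40, delta=60)

# A deliberately poor design, where the bias is far from negligible.
large_bias = bias_of_estimate(N=3, t=1, delta=1)
```

In both cases the exact bias equals $-(N+1)$ times the probability that no tagged individual would be met in a population enlarged by one tagged member, which is why it dies away once $\delta t / N$ is comfortably larger than $\log N$.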

To determine the variance of $\hat N$, $(s+1)^{-2}$ is expressed in an inverse
factorial series as in [4] so that

(N.i)2  =(trt)2  [■>(«•«-»]  * (b.iHsW(6.3) 

To  evaluate  the  expectation  of  tills  latter  series  let 
n(i)  * n(n-l)(n-2)  ...(n-i+-l)  and  observe  that  for  a r«y  i£  j (j-i-6  ) 

(12)  E fl^i=i>LLL )=  (^0^  (1  , 


1 




Here $\pi_j$ stands for $j$ terms of a form similar to the one in the brackets
of formula (10). This formula is derived by a direct summation exactly
parallel to that used in obtaining (10).

If $\pi_j$ is neglected and the remaining terms on the right hand side of
(12) are written out, then

(13)  <r£  * E(H)2-*2  » (tvl)2  [U22t42^2q^. . . .)J-2N-2-S‘ 

Table I gives a tabulation of $\sigma_{\hat N}/N$ for various representative values of $N$,
$t$ and $\delta$ obtained by means of this formula.

The average sample size required by this procedure can be determined
simply from (9). In fact again by direct summation

(14)  $$E\left[(n+i-1)^{(i)}\right] = \frac{(\delta+i-1)!\,(N+i)!\,(N-t)!}{(\delta-1)!\; N!\,(N-t+i)!}$$

so that

(15)  $$E(n) = \frac{\delta(N+1)}{N-t+1}.$$

Either of these formulae can be obtained from (6) and (8) by an
interchange of $N-t$ and $t$ and by replacing $s$ by $\delta$. It is seen immediately that
the tremendous variation possible in the earlier inverse sampling model is
now eliminated.

Since $\delta$ will usually be reasonably large the most appropriate
approximation to use for testing purposes is the normal distribution. In particular,
it is desirable to work with $n$ using formulae (15) and (16). Writing


p»  and  using  a trivial  modification  of  the  approximation  of  (16) 




(n  - -X-) 

— 1~2.'  ' is  approxima  teiy  N(0,.l) 

/ jLl. 

v/  1-p* 

and confidence limits for $p'$ (and hence $N$) are obtained from the quadratic
equation

(17)  [n(l-p')  -t)]2  - k2  a 'p’d-p'). 

The approximation may be improved slightly if the exact formula for $\sigma_n^2$ is
used but this involves solving a higher degree equation.

DIRECT MULTIPLE SAMPLE CENSUSES

In the most usual type of direct multiple sample census, at each stage
a group of individuals is drawn from the population. Those that are tagged
are noted; those not tagged are tagged and then the whole group returned to
the population. This sampling may take place without replacement. Even
where the sampling is with replacement it may be desirable to set up the
experiment so that the model appropriate to sampling without replacement
is correct. This may be done by ignoring recaptures in the same sampling
period. It is frequently desirable to do so to avoid possible nonrandomness
involved in such recaptures.

For this situation the appropriate model is given by the following
conditions

a)  $t_{i+1} - t_i = n_i - s_i$,  $i = 1, 2, \ldots, k$

b)  $t_1 (= n_0)$, $n_1, n_2, \ldots, n_k$ are parameters

c)  $t_2, t_3, \ldots, t_k$; $s_1, s_2, \ldots, s_k$ are random variables




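(18)  $$\Pr(s_1, s_2, \ldots, s_k) = \prod_{i=1}^{k} \frac{\binom{t_i}{s_i}\binom{N-t_i}{n_i-s_i}}{\binom{N}{n_i}} = \frac{(N-n_0)!}{(N-t_{k+1})!}\prod_{i=1}^{k} \frac{t_i!\; n_i!\; (N-n_i)!}{s_i!\,(t_i-s_i)!\,(n_i-s_i)!\; N!}$$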


The remaining terms telescope because of (a). From formula (18) it is
apparent that $\sum_{i=1}^{k} s_i$ is a sufficient statistic for $N$. Moreover the maximum
likelihood estimate of $N$ is easily derived from (18) [see Appendix II].
It is the solution of the equation

(19)  $$\prod_{i=0}^{k}\left(1 - \frac{n_i}{N}\right) = 1 - \frac{t_{k+1}}{N}.$$


In view of the fact that $N$ is much larger than any $n_i$ a first
approximation to the maximum likelihood estimate is

(20)  $$N_1 = \frac{\sum_{i=1}^{k}\sum_{j=0}^{i-1} n_i n_j}{\sum_{i=1}^{k} s_i}.$$

This formula is obtained by ignoring terms involving $1/N^3$ or higher powers
of $1/N$ in (19). If the cubic terms are retained the approximate equation to
be solved for the maximum likelihood estimate is

(21)  $$\left(\sum_{i=1}^{k} s_i\right) N^2 - \left(\sum_{i>j} n_i n_j\right) N + \sum_{i>j>l} n_i n_j n_l = 0,$$

the subscripts in the second and third sums running from 0 to $k$.


The larger root of this equation (which will be denoted $N_2$) is the
desired one. As shown in Appendix II, under certain conditions that will be
frequently met with in practice, the maximum likelihood estimate will lie
between $N_2$ and $N_1$ and furthermore the difference $N_1 - N_2$ will be relatively
small compared to $N_1$.
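To make the relation between the likelihood equation and its two approximations concrete, the sketch below solves the equation $\prod_{i=0}^{k}(1 - n_i/N) = 1 - t_{k+1}/N$ by bisection and compares the root with the ratio of the pairwise-product sum to the total recoveries and with the larger root of the quadratic obtained when cubic terms are retained. All function names and the census figures are hypothetical, and the code assumes, rather than proves, that the two approximations bracket the root.

```python
from itertools import combinations
from math import prod, sqrt


def sum_of_products(values, r):
    """Sum of products of the values taken r at a time, all subscripts different."""
    return sum(prod(c) for c in combinations(values, r))


def likelihood_equation(N, n, t_last):
    """Left side minus right side of the likelihood equation; n = [n_0, ..., n_k]."""
    return prod(1 - ni / N for ni in n) - (1 - t_last / N)


def multiple_census_estimates(n, s, iterations=200):
    """Return (first approximation, second approximation, maximum likelihood root).

    n = [n_0, n_1, ..., n_k] with n_0 the initial number tagged,
    s = [s_1, ..., s_k] the recoveries at each stage.
    """
    total_recoveries = sum(s)
    t_last = sum(n) - total_recoveries  # tags at large after the last sample
    a2 = sum_of_products(n, 2)
    a3 = sum_of_products(n, 3)
    first = a2 / total_recoveries
    disc = a2 * a2 - 4 * total_recoveries * a3
    if disc < 0:
        raise ValueError("the quadratic has no real root for these data")
    second = (a2 + sqrt(disc)) / (2 * total_recoveries)
    lo, hi = second, first
    if not (likelihood_equation(lo, n, t_last) >= 0 >= likelihood_equation(hi, n, t_last)):
        raise ValueError("the two approximations do not bracket the root")
    for _ in range(iterations):  # plain bisection
        mid = (lo + hi) / 2
        if likelihood_equation(mid, n, t_last) > 0:
            lo = mid
        else:
            hi = mid
    return first, second, (lo + hi) / 2


# Hypothetical census: 100 tagged initially, then three samples of 100.
n = [100, 100, 100, 100]
s = [10, 18, 25]
first, second, mle = multiple_census_estimates(n, s)
```

For these made-up figures the root lies much closer to the quadratic's larger root than to the first approximation, in line with the remark that the second approximation is the one to use.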




It is interesting to note that the same estimate is obtained by the method
of moments. For, if $\sum_{i=1}^{k} s_i$ is denoted $S_k$, then


(22) 


E<S*> 


where  *;  sum  of  proouets  of  nQ,  n^,  n^,...,  n^  taken  i at  a time  with  ail 

subsciipts  different.  Either  of  these  results  is  most  easily  obtained  by  in- 
duction. 

The distribution of $S_k$ is difficult to obtain in any simple usable form,
so that it is difficult to evaluate the small sample properties of these
estimates. It may be noted that these estimates are not strictly comparable with
those of Schumacher and Eschmeyer, Schnabel and the author's own, referred to

p 

in  the  introduction.  These  latter  estimates  which  are  generally  of  a ^ type 
are  derived  on  the  assumption  (implicitly  or  explicitly)  that  the  number  of 

tags  is  a fixed  parameter,  not  a random  variable. 

It is possible to write down simply an unbiased estimate which is valid
in both of these models (i.e. whether the $t_i$ are random variables or
parameters) viz.

(23)  $$N^* = \frac{1}{k}\sum_{i=1}^{k}\left[\frac{(n_i+1)(t_i+1)}{s_i+1} - 1\right]$$

provided the $s_i$ (or more precisely $n_i t_i/N$) are sufficiently large [roughly,
near 10 or greater in size]. For

$$E(N^*) = \frac{1}{k}\sum_{i=1}^{k} E\left[\frac{(n_i+1)(t_i+1)}{s_i+1} - 1\right] = \frac{1}{k}\sum_{i=1}^{k} E_{t_i}(N) = N$$



under  the  restrictions  noted  (those  fiver*  in  f 4 j p.  Li?). 

As a consequence of a theorem of Blackwell however, it can be said
that an unbiased estimate for $N$ based on $\sum s_i$ has a smaller variance than $N^*$,
within the model in which the $t_i$ are cumulative and $t_{i+1} - t_i = n_i - s_i$.

Where the $t_i$ are not random variables but fixed parameters, $N^*$ is a
desirable estimate (subject to the restrictions $n_i t_i/N$ not too small) and
furthermore the variance of $N^*$ may be computed from formula 33 of [4] and the
fact that

(24)  $$\sigma_{N^*}^2 = \frac{1}{k^2}\sum_{i=1}^{k} \sigma^2\!\left[\frac{(n_i+1)(t_i+1)}{s_i+1}\right].$$


It will frequently happen that the $n_i t_i/N$ are too small to permit $N^*$ being
an unbiased estimate. For this situation, where the $t_i$ are fixed, a slight
modification of an estimate given by Schnabel [17] will be most useful, viz.

(25)  $$N_p = \frac{\sum_{i=1}^{k} n_i t_i}{\sum_{i=1}^{k} s_i + 1}.$$

This is based on the fact that each $s_i$ has approximately a Poisson
distribution with parameter $n_i t_i/N$ and hence $\sum_{i=1}^{k} s_i$ has approximately a Poisson
distribution with parameter $\sum_{i=1}^{k} n_i t_i/N = \lambda$ (say).

$E(N_p) = N(1-e^{-\lambda})$ is easily derived. Furthermore
$Ne^{-\lambda}$ will be certainly negligible if $\sum_{i=1}^{k} n_i t_i > N \log N$. This is a
much lesser restriction than that the same inequality hold for each $n_i t_i$.
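The expectation quoted for the Schnabel-type estimate can be verified numerically under the Poisson approximation. The sketch below computes the estimate (sum of $n_i t_i$ divided by total recoveries plus one) for some made-up data, and separately sums the Poisson series for its expectation; the function names, the truncation point `terms` and the data are assumptions introduced for illustration.

```python
from math import exp


def schnabel_type_estimate(n, t, s):
    """Sum of n_i * t_i divided by (total recoveries + 1)."""
    return sum(ni * ti for ni, ti in zip(n, t)) / (sum(s) + 1)


def expected_estimate_poisson(N, n, t, terms=2000):
    """Expectation of the estimate when total recoveries are Poisson(lambda)."""
    weight = sum(ni * ti for ni, ti in zip(n, t))
    lam = weight / N
    pmf = exp(-lam)  # Poisson probability of zero recoveries
    total = 0.0
    for x in range(terms):
        total += pmf / (x + 1)  # contributes Pr(S = x) / (x + 1)
        pmf *= lam / (x + 1)    # step to Pr(S = x + 1)
    return weight * total


# Hypothetical three-stage census with the numbers of tags treated as fixed.
n = [50, 50, 50]
t = [100, 140, 180]
s = [6, 8, 9]
N_true = 1000

estimate = schnabel_type_estimate(n, t, s)
expected = expected_estimate_poisson(N_true, n, t)
lam = sum(ni * ti for ni, ti in zip(n, t)) / N_true
```

Here $\lambda = 21$, so the shortfall $Ne^{-\lambda}$ is of the order of $10^{-6}$ and the estimate is unbiased for all practical purposes.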

Neglecting terms of the form $\lambda^i e^{-\lambda}$ $(i = 1, 2, 3)$ an approximate
formula for the standard deviation of $N_p$ is derived, viz.,


(26) 


c-ju- 


1 r 


\i 


/&-- 


+>*; 


The derivation is exactly parallel to the derivation of (33) in [4],
through the use of an inverse factorial series.

Many sample censuses will not fall strictly into either of these models:
some of the unmarked individuals taken in a sample will be returned as marked
members, but not all. A full treatment of this case is clearly complicated.
Where the $n_i t_i/N$ are small, the estimate $N_p$ is probably quite satisfactory.

The Poisson approximation will also be useful as a basis of constructing
tests and confidence intervals for $N$. The confidence limits given in (4) may
clearly be used (with $nt$ replaced by $\sum_{i=1}^{k} n_i t_i$) if the $t_i$ are fixed parameters.
Within the model primarily considered in this section this will probably be
reasonably satisfactory also.

If the experimenter observes $s_{ij}$, the number of tags recovered in sample
$i$ that were placed in the $j$-th tagging, rather than simply $s_i$ $\left(s_i = \sum_{j=0}^{i-1} s_{ij}\right)$,
more information is apparently available. Actually the knowledge of the $s_{ij}$ is
of no value in estimating $N$, or in constructing tests for $N$. For the joint
probability of the $s_{ij}$ is

(27)  $$\Pr(\{s_{ij}\}) = \prod_{i=1}^{k} \frac{\left[\prod_{j=0}^{i-1}\binom{n_j - s_j}{s_{ij}}\right]\binom{N-t_i}{n_i-s_i}}{\binom{N}{n_i}}.$$

From the form of this probability distribution it is clear that
$\sum_{i=1}^{k} s_i = \sum_{i=1}^{k}\sum_{j=0}^{i-1} s_{ij}$ is still a sufficient statistic for $N$.

However the $s_{ij}$ may be used to test the assumptions that the sampling
is random (i.e. the probability of including an animal in the sample is
independent of its being tagged) or that the population is constant.

Under  the  hypothesis  tested,  that  N is  constant,  giver  t<  (j  = i,  l,2,...,i-l) 

u 

the  e^j  have  a multihypergeometric  distribution  or  neglecting  the  sampling 

without replacement a multinomial distribution. However the usual $\chi^2$ test
does not follow immediately, for the estimate of $N$ will be determined from
the random variables $s_i$, which are not independent for different $i$.

The author, however, conjectures that in view of the nature of the $\chi^2$ test
this is asymptotically negligible. Much more serious is the fact that in
many cases at least, the $E(s_{ij})$ are too small for a reasonable application of the
test.

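As an alternative procedure a non-parametric test is suggested. An array
of the form

(28)
$$\begin{array}{llll}
\dfrac{s_{10}}{n_1 n_0}, & \dfrac{s_{20}}{n_2 n_0}, & \ldots, & \dfrac{s_{k0}}{n_k n_0} \\[2ex]
\dfrac{s_{21}}{n_2 (n_1 - s_1)}, & \dfrac{s_{31}}{n_3 (n_1 - s_1)}, & \ldots, & \dfrac{s_{k1}}{n_k (n_1 - s_1)} \\[1ex]
\vdots \\[1ex]
\dfrac{s_{k,k-1}}{n_k (n_{k-1} - s_{k-1})}
\end{array}$$

may be formed, in which each element is a random variable with expectation $1/N$.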

Within each row the random variables are independent; between rows
this is not the case, for the $s_{ij}$, for fixed $i$, are observations from the
same sample. However the correlation between any two $s_{ij}$, say $s_{ip}$, $s_{iq}$, is

ij5  * iq  * ig 


- .V  2W.  N-6- 

N 


ML  _1 

N N-l 


n.(n  - s - ) 

which  will  be  usually  negligible.  Moreover  il‘  -= — - d — 


la  small 


compered  to  n^  the  conditional  probability  of  aiy.  given  is  almost 

equal  to  tne  unconditional  probability  of  s^  . 

In view of this the sign test suggested by Moore and Wallis [12] to
test for randomness in a sequence of independent observations from a common
distribution may be appropriate. The test is based on the statistic $D$, the
number of negative signs in the sequence of successive differences of
observations. Moore and Wallis tabulated the probability distribution of $D$
for small values of $n$, the number of observations. They conjectured, and
Mann [11] subsequently proved rigorously, that $D$ is asymptotically normally
distributed. In application of the test, since

$$E(D) = \frac{n-1}{2} \quad \text{and} \quad \sigma_D^2 = \frac{n+1}{12},$$

$\left(D - \dfrac{n-1}{2}\right)\Big/\sqrt{\dfrac{n+1}{12}}$ is taken to be distributed according to $N(0,1)$ for $n \ge 12$.

If the array (28) is considered as a single sequence of observations the $n$ of
the Moore and Wallis test is here equal to $\dfrac{k(k+1)}{2}$.

In many cases the alternatives to randomness are essentially one-sided.

For example, some that may be considered are:

(a)  the tagged individuals die off more rapidly or disappear so as
not to be available for sampling

(b)  the tagged individuals disperse from the tagging location slowly
and are more likely to be recaptured in the samples taken soon after the
tagging rather than later

(c)  there is a marked change in the composition of the population
due to natural processes or to migration.



If any of these alternatives is true, the random variables in each row
of the array will tend to decrease. In this case a test based on the whole
array as a single sequence has the following defect: if the alternatives
are true, in each row the probability of a negative difference is greater
than 1/2 but the probability of a negative difference between the last
element of any row and the first of the next row will be much less than 1/2.

To avoid this it is necessary to consider each row separately, i.e.,
the array (28) may be considered as $k$ sequences of observations decreasing
in length from $k$ to 1. A test of randomness may be made using the sum of
the number of negative differences in these $k$ sequences (actually $k-1$ since
no difference is obtainable from the last row).

Let

(29)  $$X = D_1 + D_2 + \cdots + D_{k-1}$$

where $D_i$ = number of negative differences in row $i$.

Then $E(X) = \dfrac{k(k-1)}{4}$ and $\sigma_X^2 = \dfrac{(k-1)(k+4)}{24}$.

The asymptotic normality of $X$ is immediate using the fact that $D$ itself
tends to be normally distributed as $k$ tends to infinity while the initial
terms of the sum (29) are asymptotically negligible as $k$ becomes large.

Table II gives a partial tabulation of the distribution of $X$ for
$k = 4, 5, 6, 7$.
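The row-by-row sign statistic is simple to compute once the recoveries are classified by tagging occasion. The sketch below builds the rows of ratios $s_{ij} / [n_i (n_j - s_j)]$ and counts negative successive differences within each row; the function names, the dictionary layout for the $s_{ij}$ and the data are all hypothetical, and ties are counted as non-negative, which is a convention adopted here rather than one stated in the text.

```python
from fractions import Fraction


def negative_differences(row):
    """Number of negative signs among the successive differences of a sequence."""
    return sum(1 for a, b in zip(row, row[1:]) if b < a)


def recovery_rows(n, s_tot, s_by_tagging):
    """Rows of ratios s_ij / (n_i * (n_j - s_j)), one row per tagging occasion j.

    n            : [n_0, n_1, ..., n_k]
    s_tot        : [s_0 = 0, s_1, ..., s_k]
    s_by_tagging : dict mapping (i, j) to s_ij for 0 <= j < i <= k
    """
    k = len(n) - 1
    rows = []
    for j in range(k):
        newly_tagged = n[j] - s_tot[j]  # individuals first tagged at occasion j
        rows.append([
            Fraction(s_by_tagging[(i, j)], n[i] * newly_tagged)
            for i in range(j + 1, k + 1)
        ])
    return rows


def row_sign_statistic(rows):
    """Total number of negative differences, taken row by row."""
    return sum(negative_differences(row) for row in rows)


# Hypothetical data for k = 3 samples.
n = [100, 50, 50, 50]
s_tot = [0, 5, 9, 12]
s_by_tagging = {(1, 0): 5, (2, 0): 5, (2, 1): 4, (3, 0): 4, (3, 1): 3, (3, 2): 5}

rows = recovery_rows(n, s_tot, s_by_tagging)
x = row_sign_statistic(rows)
```

An unusually large value of the statistic points toward the one-sided alternatives listed earlier, under which recoveries from each tagging fall off with time.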
INVERSE MULTIPLE SAMPLE CENSUSES

As in the single stage census two models may be considered. In the
first of these, sampling and tagging are carried out in exactly the same
manner as in the direct multiple sample census, except that the number of
tagged individuals at each stage is predetermined, rather than the sample
size.

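In this model

a)  $t_{i+1} - t_i = n_i - s_i$,  $i = 1, 2, \ldots, k$

b)  $t_1 (= n_0)$, $s_1, s_2, \ldots, s_k$ are parameters

c)  $t_2, t_3, \ldots, t_k$; $n_1, n_2, \ldots, n_k$ are random variables,

and

$$\Pr(n_1, n_2, \ldots, n_k) = \prod_{i=1}^{k} \frac{(n_i-1)!\; t_i!\; (N-t_i)!\; (N-n_i)!}{(s_i-1)!\,(t_i-s_i)!\,(n_i-s_i)!\; N!\; (N-t_{i+1})!}.$$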


There is now no non-trivial sufficient statistic for $N$. However, an
unbiased estimate is easily found, namely

(30)  $$\hat N = \frac{1}{k}\sum_{i=1}^{k}\left[\frac{n_i(t_i+1)}{s_i} - 1\right]$$

for

(31)  $$E(\hat N) = \frac{1}{k}\sum_{i=1}^{k} E\left\{E\left[\frac{n_i(t_i+1)}{s_i} - 1 \,\Big|\, n_j,\ j<i\right]\right\} = \frac{1}{k}\sum_{i=1}^{k} E[N] = N.$$
✓ s 

Using the approximate formula for the variance of $\hat N$ in the single
sample case, (8), by a similar procedure to the derivation of (31) it
is found that

(32)  $$\sigma_{\hat N}^2 = \frac{N^2}{k^2}\sum_{i=1}^{k}\frac{1}{s_i}.$$
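A minimal sketch of the inverse multiple sample estimate follows: it averages the single-stage estimates $n_i(t_i+1)/s_i - 1$ and reports the approximate relative standard deviation $\sigma_{\hat N}/N = \sqrt{\sum 1/s_i}\,/k$ implied by the variance formula just given. The function name and the two-stage data are assumptions made for illustration.

```python
from math import sqrt


def inverse_multiple_estimate(n, t, s):
    """Average of the stage estimates n_i (t_i + 1) / s_i - 1, with its relative SD.

    n : sample sizes actually required at each stage (random)
    t : number of tags at large when each stage begins
    s : preassigned number of tagged recoveries at each stage
    """
    k = len(n)
    estimate = sum(ni * (ti + 1) / si - 1 for ni, ti, si in zip(n, t, s)) / k
    relative_sd = sqrt(sum(1 / si for si in s)) / k  # approximate sigma / N
    return estimate, relative_sd


# Hypothetical two-stage census: 99 tags initially, 10 recoveries required per stage.
# The first stage needed 100 draws, so 90 new tags were added before the second.
estimate, relative_sd = inverse_multiple_estimate(n=[100, 60], t=[99, 189], s=[10, 10])
```

Because the relative standard deviation depends only on the preassigned $s_i$, the precision of the design is known before any sampling is done.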
In this inverse sampling procedure there are several parameters at
the disposal of the experimenter: $n_0$, $k$ and $s_1, s_2, \ldots, s_k$. It may
be desirable to choose them so as to minimize $E\left(\sum_{i=1}^{k} n_i\right)$ while holding
$\sigma_{\hat N}/N$ fixed. This would achieve a fixed precision of estimation while
minimizing the effort expended. However after arbitrarily fixing $n_0$
and $k$, the optimum choice of the $s_i$ necessitates the solution of an
algebraic equation of high degree (the degree increases rapidly with $k$)
which involves $N$ in a complex fashion.

In lieu of general rules Table III gives the properties of a number
of simple designs. The approximate formula (32) was used to calculate
$\sigma_{\hat N}/N$; $E(n_i)$ was calculated recursively from the formula

$$E(n_i) = \frac{s_i (N+1)}{\sum_{j=1}^{i-1}\left[E(n_j) - s_j\right] + n_0 + 1}.$$

In order that this give a reasonable approximation it is necessary that
$s_i$ be not too small. For this reason some of the more interesting cases
with the initial $s_i$ very small are excluded.
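The recursive calculation of the expected sample sizes can be sketched as follows. At each stage the expected number of tags at large is $n_0$ plus the expected number of untagged individuals taken so far, and the single-stage result $s(N+1)/(t+1)$ is applied with that expected value in place of $t$. The function name and the design figures are hypothetical; replacing $t_i$ by its expectation is the approximation described in the text, not an exact result.

```python
def expected_sample_sizes(N, n0, s):
    """Approximate E(n_i) for an inverse multiple sample census.

    N  : population size assumed for planning purposes
    n0 : number tagged before the first sample
    s  : preassigned numbers of tagged recoveries, stage by stage
    """
    expected = []
    untagged_so_far = 0.0  # running sum of E(n_j) - s_j over earlier stages
    for si in s:
        tags_at_large = n0 + untagged_so_far
        e_ni = si * (N + 1) / (tags_at_large + 1)
        expected.append(e_ni)
        untagged_so_far += e_ni - si
    return expected


# Hypothetical design: N = 999, 99 initial tags, 10 recoveries required per stage.
sizes = expected_sample_sizes(N=999, n0=99, s=[10, 10])
total_effort = sum(sizes)
```

The second stage is expected to need far fewer draws than the first because the first stage nearly doubles the number of tags at large.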

In the second model dealing with the inverse sampling procedure, at
each stage $n_i - s_i$ is predetermined. As before we write $n_i - s_i = \delta_i$. Thus

a)  $t_{i+1} - t_i = \delta_i$  $(i = 0, 1, \ldots, k)$

b)  $s_1, s_2, \ldots, s_k$; $n_1, n_2, \ldots, n_k$ are random variables.

Since $t_i$ is a parameter at each stage of the sampling procedure
there is now independence between the successive random variables.
Consequently the results determined in the single sample case are easily
generalized to this model. Under the restrictions on $t_i$ and $\delta_i$ for which (11)
holds, it follows that

(33)  $$\tilde N = \frac{1}{k}\sum_{i=1}^{k}\left[\frac{n_i(t_i+1)}{s_i+1} - 1\right]$$

is an almost unbiased estimate. Also the variance is determined from (13)
and the usual formula for the sum of the variances. Similarly

(34)  $$E(n_i) = \frac{\delta_i (N+1)}{N - t_i + 1}$$

is derived from (15).

It may be mentioned that a maximum likelihood estimate may be
derived for this model which is analogous to that given by formula (19).
For again a telescoping occurs in the formula for the joint
probabilities of the random variables viz.

$$\prod_{i=1}^{k}\frac{(n_i-1)!\; t_i!\; (N-t_i)!\; (N-n_i)!}{(n_i-\delta_i)!\,(t_i-n_i+\delta_i)!\,(\delta_i-1)!\,(N-t_{i+1})!\; N!}.$$

The maximum likelihood estimate of $N$ is the solution of the equation

(35)  $$\prod_{i=0}^{k}\left(1-\frac{n_i}{N}\right) = 1 - \frac{t_{k+1}}{N}$$

where we write $n_0$ for $t_1$. In this case there is no simple sufficient
statistic: the maximum likelihood estimate is a function of all the $n_i$.
Since this is so and since the solution of (35) has a complicated
distribution, the estimate $\tilde N$ seems more desirable.


SEQUENTIAL TESTS FOR N

The optimum sample census procedures evidently are sequential:
furthermore the sequential procedure should permit a choice of the design
at any stage rather than merely a choice of whether or not to take
further observations. For simplicity only standard sequential procedures
are considered here, primarily in relation to tests for $N$. Such tests
may be useful in control and management problems.

For direct sample censuses with $E(s_i)$ small we consider $n_i$ as
preassigned and let $k$, the number of stages in the census, be a random
variable. Since the $s_i$ have approximately a Poisson distribution, Wald's
theory [9] is applicable in a routine manner.

In particular to test the hypothesis $H$: $N = N_0$ against the
alternatives $N > N_0$ at a level of significance $\alpha$, so that the power at a
particular alternative $N_1$ is $1 - \beta$, we proceed as follows:

at any stage $j$ after $n_1, n_2, \ldots, n_j$ individuals have been sampled
and $s_1, s_2, \ldots, s_j$ tags recovered

(36)  Reject $H$ if  $$\sum_{i=1}^{j} s_i \le \frac{\log A - \sum_{i=1}^{j} n_i t_i\left(\dfrac{1}{N_0}-\dfrac{1}{N_1}\right)}{\log N_0 - \log N_1}$$

(37)  Accept $H$ if  $$\sum_{i=1}^{j} s_i \ge \frac{\log B - \sum_{i=1}^{j} n_i t_i\left(\dfrac{1}{N_0}-\dfrac{1}{N_1}\right)}{\log N_0 - \log N_1}$$

Continue observations if $\sum_{i=1}^{j} s_i$ lies between the bounds in
(36) and (37).

Here $A = \dfrac{1-\beta}{\alpha}$, $B = \dfrac{\beta}{1-\alpha}$.

The bounds in (36) and (37) are themselves random variables: this
is so because the $s_i$ are dependent in the model considered. However
the conditional distribution of $s_j$ given $s_i$ $(i < j)$ is of the Poisson
form, approximately, with parameter $n_j t_j/N$. The setting up of the
sequential probability ratio test does not require independence of the
observations: it depends on the conditional distribution of the
observed random variable at each stage. However the various optimum
properties and associated results that Wald and others have proved for
the sequential probability ratio test are not necessarily valid
without independence. However approximate formulae could be determined for
most situations from the formulae for the operating characteristic curve
and for $E(n)$ in the standard Poisson situation. Some of these may be
found in [8].




In conclusion it may be reiterated that the sample censuses studied
here have been limited to situations where the population is essentially
stationary. This excludes populations where any substantial immigration
or emigration occurs. It does not necessarily exclude populations which
are changing due to natural causes, e.g. birth and death. If those born
subsequently to the initial tagging operation can be distinguished from
the rest of the population without undue effort, and if the death rate
is the same among the tagged and untagged groups, then the sample census
yields valid information on the population size at the time of first
tagging.

For suppose the probability of survival from time of tagging to
time of sampling is $p$. Now consider the almost unbiased estimate in the
direct single sample census model, where at the time of sampling $t'$,
$(N-t)' = u'$, $N'$ actually are surviving.


Since $t'$, $u'$ are independent and since it is reasonable to assume
they have a binomial distribution


(38) 


r—  1 

i fa  } - 

1 + fr.-tly- 

L s +1 

(t-hl)p 

= N 


(ttl)  - 1 


The denominator in the right hand side of (38) comes from the fact that,
by direct summation,

$$E\left(\frac{1}{t'+1}\right) = \frac{1-(1-p)^{t+1}}{(t+1)p} \approx \frac{1}{(t+1)p}$$

for large $t$, and moderate $p$.



Similarly the estimate $\hat N = nt/s$ of the inverse sample census
remains unbiased whether or not mortality occurs in the population (provided
tagged and untagged are proportionally affected). Even the variance of
these estimates is only slightly modified unless the mortality is
excessive. For example, the approximate formula (8) now becomes

which is approximately $N^2/s$ unless $p$ is very small.

However if there is natural mortality the multiple sample census may
be seriously affected; estimates of $N$ in such a case may be meaningless.
In fact this forms a basis for the possible estimation of natural
mortality in an animal population, a fact utilized by Leslie and Chitty in a
recent paper [10].

The references cited below are those referred to in the body or
appendix of the paper. No attempt has been made to compile a complete
bibliography, even of the methodology in this field. In fact nowhere is
such a bibliography available apparently; however the monographs of
Ricker [14] and Schaefer [16] give a large number of useful references.
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Appendix 

I To  show  the  asymptotic  normality  of  z as  defined  in  formula  (3) 
consider  the  moment  generating  function  of  z: 

Pa  1 6 a P 1_  g 

Mz(0)  = p8e  a V1-?  .1  - (l-p)a 

by  routine  algebra  and  the  usual  lanipulations  it  may  be  seen 
9 2 

that  jiff  .-1_(0)  tends  to  as  s tends  to  infinity  (e.g.  cf.  Craner  5 
* 2 

PP.  193-199) . 

On the other hand the moment generating function of the random
variable 2np is

$$M_{2np}(\theta) = p^s \, e^{2ps\theta} \left[ 1 - (1-p)\, e^{2p\theta} \right]^{-s}$$

and

$$\ln M_{2np}(\theta) = 2ps\theta - s \ln\left[ 1 - 2\theta + h(p) \right]$$

where $h(p) \to 0$ as $p \to 0$. Hence

$$\lim_{p \to 0} M_{2np}(\theta) = (1 - 2\theta)^{-s}$$

which is the moment generating function of the $\chi^2$ distribution with
2s degrees of freedom.
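Both limits can be checked numerically. The sketch below is an illustration under stated assumptions: n is treated as the number of draws needed to obtain s tagged recoveries when each draw is tagged with probability p (a negative binomial model), and z is assumed to be the standardized variable (np - s)/sqrt(s(1-p)). The function names are hypothetical.

```python
import math


def log_mgf_n(t, s, p):
    """log E[exp(t*n)] for the assumed negative binomial n."""
    return s * math.log(p) + s * t - s * math.log(1.0 - (1.0 - p) * math.exp(t))


def log_mgf_2np(theta, s, p):
    """log MGF of 2np, obtained by rescaling the argument."""
    return log_mgf_n(2.0 * p * theta, s, p)


def log_mgf_z(theta, s, p):
    """log MGF of the assumed z = (n*p - s) / sqrt(s*(1-p))."""
    root = math.sqrt(s * (1.0 - p))
    return -theta * s / root + log_mgf_n(p * theta / root, s, p)


def log_mgf_chi2(theta, s):
    """log MGF of chi-square with 2s degrees of freedom (theta < 1/2)."""
    return -s * math.log(1.0 - 2.0 * theta)


# Small p: 2np behaves like chi-square with 2s degrees of freedom.
print(log_mgf_2np(0.2, 5, 1e-4), log_mgf_chi2(0.2, 5))
# Large s: the log MGF of z approaches theta^2 / 2.
print(log_mgf_z(0.5, 10_000, 0.1), 0.5 ** 2 / 2)
```

The two printed pairs should agree closely, the first because p is small and the second because s is large.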


II. Maximum Likelihood Estimation.

The maximum likelihood estimate is derived by setting the ratio of
the likelihoods at N and N - 1 equal to unity. This ratio is

(40)    $R(N) = \prod_{i=1}^{k} q_i(N)$    where

(41)    $q_i(N) = \dfrac{(N - n_i)(N - t_i)}{N \, (N - n_i - t_i + s_i)}$

and also

(42)    [illegible]


so that the ratio R(N) of (40) satisfies $R(N) \gtrless 1$ according as

(43)    [illegible]

also

(44)    $q_i(N) \gtrless 1$ according as $N s_i \lessgtr n_i t_i$.

Let $m = \min_i \, n_i t_i / s_i$ and $M = \max_i \, n_i t_i / s_i$.

In view of (40) and (44) the roots of (19) must lie between m and M. If
one or more $s_i$ are zero M will not be finite; however, unless all
$s_i$ are zero (and hence $\sum s_i = 0$) it is evident from the form of
equation (43) that R(N) < 1 for sufficiently large N. The case
$\sum s_i = 0$ may be neglected since $\Pr(\sum s_i = 0)$ will be
extremely small.
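The bracketing of the estimate between m and M suggests a direct search. The sketch below is an illustration, not the paper's procedure: it assumes each sample contributes a hypergeometric likelihood, so that the ratio of likelihoods at N and N - 1 is the product of factors (N - n_i)(N - t_i) / (N (N - n_i - t_i + s_i)), and it assumes every s_i is positive. The function names and the data are hypothetical.

```python
import math


def log_ratio(N, samples):
    """log of the likelihood ratio L(N) / L(N - 1) over all samples."""
    total = 0.0
    for n, t, s in samples:
        total += math.log((N - n) * (N - t)) - math.log(N * (N - n - t + s))
    return total


def mle_population(samples):
    """Integer N maximising the likelihood; samples holds (n_i, t_i, s_i)."""
    m = min(n * t / s for n, t, s in samples)
    M = max(n * t / s for n, t, s in samples)
    feasible = max(n + t - s for n, t, s in samples)  # smallest admissible N
    start = max(feasible, math.floor(m))
    best_N, best_gain, gain = start, 0.0, 0.0
    # The likelihood cannot decrease below m or increase above M,
    # so only integers between them need to be examined.
    for N in range(start + 1, math.ceil(M) + 1):
        gain += log_ratio(N, samples)  # log L(N) - log L(start)
        if gain > best_gain:
            best_N, best_gain = N, gain
    return best_N, m, M


# Two hypothetical samples: (sample size, tagged in population, recoveries).
samples = [(50, 100, 7), (60, 143, 10)]
print(mle_population(samples))
```

Accumulating the log ratio rather than stopping at the first crossing of unity guards against the case, noted in the text, where the likelihood equation has several roots in the interval.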

It is possible that equation (19) has several real roots in the
interval (m, M). Those for which R(N) is decreasing represent local
maxima and one of these the extreme maximum. The large number of
parameters gives rise to a diverse number of possibilities, but since in
general $N \gg \sum_{i=1}^{k} n_i$ the following theorem will cover many
cases that may arise.


Theorem. If for N > m, [illegible] is a monotone decreasing sequence
and [illegible], then the maximum likelihood estimate $N^*$ satisfies the
following inequalities

    $N_q < N^* < N_L$


Proof.

    [inequality illegible]

since the left hand side is a weighted harmonic mean of terms
[illegible] and hence larger than the smallest term [illegible].

For $N > N_L$, [illegible] < 0, so that $N_L > N^*$.

Since [illegible] > 0, $N_q < N_L$.

If $N_q < m$ then $N_q < N^*$. Consider the contrary case $N_q > m$ and
denote the other root of the quadratic equation (21) by $N_q'$.

    [bound on $N_q'$ illegible]

[the first inequality follows from the fact that
$1 - (1-x)^{1/2} < x$]

Hence for $m < N < N_q$, [illegible] < 0, and therefore $N_q < N^*$.

Moreover [illegible], provided only [illegible]. The right hand side of
this equation is approximately of the order [illegible].

Table I


Numerical Comparison of Certain Inverse Sample Censuses with s (number of
tagged individuals to be recaptured) fixed and with u (number of untagged
individuals to be recaptured) fixed.

                                         σ_n       σ_n      σ_N̂/N, s fixed     σ_N̂/N,
   N       t     u (1)   s (1)    E(n)   u fixed   s fixed   exact    approx.   u fixed
                                          (2)       (3)       (4)       (5)       (6)

  10^4     100     990     10    1,000     3.2     316.2      .30      .32       .33
           100   4,950     50    5,000     7.1     707.1      .10      .14       .10
           500     475     25      500     5.0     100.0      .19      .20       .20
  10^5     100   4,995      5    5,000     2.2   2,236.1      .43      .45       .55
           500   4,975     25    5,000     5.0   1,000.0      .19      .20       .20
  10^6   1,000   9,990     10   10,000     3.2   3,162.3      .31      .32       .35
         1,000  19,900     20   20,000     4.5   4,472.1      .22      .22       .23

1.  s and u were chosen so that E(n) is the same (to the nearest
    integer) for both sampling plans, i.e., predetermining s or
    predetermining u.

2.  Calculated from the approximate formula $\sigma_n \approx \sqrt{ut/(N-t)}$.

3.  Calculated from the approximate formula $\sigma_n \approx \sqrt{s}\, N/t$.

4.  Calculated from formula (8) for the estimate where s is predetermined.

5.  Calculated from the approximation to formula (8).

6.  Calculated from formula (13) for the estimate where u is predetermined.
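The approximate columns of Table I can be recomputed from N, t and s. The sketch below is an illustration that assumes the large-population approximations E(n) = sN/t, sigma_n = sqrt(s) N/t and sigma_N/N = 1/sqrt(s) for the plan with s predetermined; these forms are inferred from the legible entries rather than quoted from formulas (8) or (13), and the function name is hypothetical.

```python
import math


def approximate_columns(N, t, s):
    """Return (E(n), sigma_n, sigma_N / N) under the assumed approximations."""
    expected_n = s * N / t
    sd_n = math.sqrt(s) * N / t
    rel_se = 1.0 / math.sqrt(s)
    return expected_n, sd_n, rel_se


# (N, t, s) for the seven rows of Table I.
rows = [
    (10**4, 100, 10), (10**4, 100, 50), (10**4, 500, 25),
    (10**5, 100, 5), (10**5, 500, 25),
    (10**6, 1000, 10), (10**6, 1000, 20),
]
for N, t, s in rows:
    e_n, sd_n, rel = approximate_columns(N, t, s)
    print(f"{N:>8} {t:>5} {s:>3} {e_n:>8.0f} {sd_n:>8.1f} {rel:>5.2f}")
```

The exact column (4) falls below the approximation most clearly when the expected sample is a large fraction of the population, as in the second row.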

Table II

Cumulative Distribution of X for values of k = 4, 5, 6, 7.

                                 Pr(X ≤ x)
  k   x = 0       1        2        3        4        5        6        7

  4   0.0035   0.0590   0.3056   0.6944   0.9410   0.9965   1.0000   1.0000
  5     -      0.0012   0.0172   0.1052   0.3392   0.6608   0.8948   0.9828
  6     -        -      0.0001   0.0020   0.0166   0.0627   0.2010
  7     -        -        -        -      0.0001   0.0009   0.0323   0.1103

(The column positions of the entries for k = 6 and k = 7 are uncertain
in the source.)

Table III

Standard Error of the Estimate of N and Total Effort Required
by Various Multiple Inverse Sample Censuses.

                                                  N = 10,000         N = 100,000
  k   s_1  s_2  s_3  s_4  s_5    t_0   σ_N̂/N    E(n')*   ratio     E(n')*   ratio

  3     5   10   15    -    -    224   0.202      901    2.24      3385    0.60
  3    10   10   10    -    -    317   0.18?      923    1.93      4020    0.46
  3    15   10    5    -    -    388   0.202      963    2.10      4595    0.44
  4     5   10   15   20    -    224   0.162     1131    1.43      3981    0.41
  4    12   12   13   13    -    347   0.142     1155    1.23      4728    0.30
  4    20   15   10    5    -    448   0.162     1208    1.34      5499    0.29
  5     5   10   15   20   25    224   0.135     1363    0.99      4617    0.29
  5    15   15   15   15   15    388   0.116     1394    0.83      5527    0.21
  5    25   20   15   10    5    500   0.135     1452    0.93      6363    0.21
  5    11   13   15   17   19    332   0.118     1377    0.86      517?    0.23
  4    18   19   19   19    -    425   0.116     1418    0.82      5794    0.20
  4     7   15   23   30    -    265   0.134     1389    0.96      4835    0.28
  3    25   25   25    -    -    500   0.115     1464    0.79      6373    0.18
  3    15   25   35    -    -    388   0.122     1425    0.86      5566    0.22
  2    37   38    -    -    -    609   0.115     1539    0.75      7248    0.16
  2    25   50    -    -    -    500   0.122     1513    0.81      6406    0.19
  1    75    -    -    -    -    867   0.114     1732    0.66      950?    0.12

ratio = σ_N̂/N divided by E(n') and multiplied by 10^4 (the column heading
is illegible; this form is inferred from the entries). A "?" marks a
digit that is illegible in the source.

* n' = total number of individuals sampled including those marked or tagged.
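Some regularities in Table III can be verified by direct arithmetic. The sketch below is an inference from the legible entries, not a formula quoted from the paper: the initial number tagged appears to be sqrt(s_1 N) rounded up, with N = 10,000, which is the value of t minimizing the single-stage effort t + sN/t. The function names are hypothetical.

```python
import math


def initial_tags(s1, N=10_000):
    """Assumed rule for the initial number tagged: sqrt(s1 * N), rounded up."""
    return math.ceil(math.sqrt(s1 * N))


def single_stage_effort(s, t, N):
    """Tagged plus expected sampled animals for one inverse sample,
    using the large-population approximation E(n) = s * N / t."""
    return t + s * N / t


# First-stage recovery targets that occur in Table III.
for s1 in (5, 7, 10, 11, 12, 15, 18, 20, 25, 37, 75):
    print(s1, initial_tags(s1))

# Single-stage row (k = 1, s = 75), N = 10,000.
print(round(single_stage_effort(75, initial_tags(75), 10_000)))
```

Because t + sN/t is flat near its minimum, rounding the initial number tagged has little effect on the expected effort.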
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